The Bane of Low-Dimensionality Clustering
Authors
Abstract
In this paper, we give a conditional lower bound of $n^{\Omega(k)}$ on running time for the classic k-median and k-means clustering objectives (where n is the size of the input), even in low-dimensional Euclidean space of dimension four, assuming the Exponential Time Hypothesis (ETH). We also consider k-median (and k-means) with penalties, where each point need not be assigned to a center, in which case it must pay a penalty, and extend our lower bound to at least three-dimensional Euclidean space. This stands in stark contrast to many other geometric problems such as the traveling salesman problem, or computing an independent set of unit spheres. While these problems benefit from the so-called (limited) blessing of dimensionality, as they can be solved in time $n^{O(k^{1-1/d})}$ or $2^{n^{1-1/d}}$ in d dimensions, our work shows that widely used clustering objectives have a lower bound of $n^{\Omega(k)}$, even in dimension four. We complete the picture by considering the two-dimensional case: we show that no algorithm solves the penalized version in time $n^{o(\sqrt{k})}$, and provide a matching upper bound of $n^{O(\sqrt{k})}$. The main tool we use to establish these lower bounds is the placement of points on the moment curve, which takes its inspiration from constructions of point sets yielding Delaunay complexes of high complexity.

∗ The project leading to this application has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 748094. The work of A. de Mesmay is partially supported by the French ANR project ANR-16-CE40-0009-01 (GATO). The work of A. Roytman is partially supported by Thorup’s Advanced Grant DFF-0602-02499B from the Danish Council for Independent Research.
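To make the objects named in the abstract concrete, the following is a minimal Python sketch of the moment curve and of a penalized clustering cost. It is an illustration only, not the paper's construction: the function names `moment_curve_point` and `penalized_cost` are hypothetical, and the single uniform `penalty` shared by all points is an assumption (the abstract does not say whether penalties vary per point).

```python
import math


def moment_curve_point(t, d=4):
    """Return the point (t, t^2, ..., t^d) on the moment curve in R^d."""
    return tuple(t ** i for i in range(1, d + 1))


def penalized_cost(points, centers, penalty, power=1):
    """Cost of k-median (power=1) or k-means (power=2) with penalties.

    Assumption: every point has the same flat `penalty`. Each point pays
    the smaller of its (powered) distance to the nearest center and the
    penalty for staying unassigned.
    """
    total = 0.0
    for p in points:
        # With no centers at all, every point is forced to pay the penalty.
        nearest = min((math.dist(p, c) for c in centers), default=math.inf)
        total += min(nearest ** power, penalty)
    return total


# Five points on the moment curve in dimension four, two of them used as centers.
points = [moment_curve_point(t) for t in range(5)]
centers = [points[0], points[1]]
cost = penalized_cost(points, centers, penalty=10.0)
```

Here the two centers serve themselves at cost zero, while the three remaining points are farther than 10 from both centers and so pay the penalty instead of being assigned.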
Similar resources
Steel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps
Steel consumption is a critical factor affecting pricing decisions and a key element to achieve sustainable industrial development. Forecasting future trends of steel consumption based on analysis of nonlinear patterns using artificial intelligence (AI) techniques is the main purpose of this paper. Because there are several features affecting target variable which make the analysis of relations...
Evaluation of Physico-Mechanical and Antimicrobial Properties of Gelatin-Carboxymethyl Cellulose Film Containing Essential Oil of Bane (Pistacia atlantica)
Background and Objectives: Microbial activity is the main factor in spoiling food products, which not only changes their texture and taste but also causes economic damage and poisoning. The present study aimed to assess the effects of essential oil of Bane (Pistacia atlantica) on physico-mechanical and antimicrobial properties of gelatin-carboxymethyl cellulose film. Materials and Methods: ...
Effective Term Based Text Clustering Algorithms
Text clustering methods can be used to group large sets of text documents. Most of the text clustering methods do not address the problems of text clustering such as very high dimensionality of the data and understandability of the clustering descriptions. In this paper, a frequent term based approach of clustering has been introduced; it provides a natural way of reducing a large dimensionalit...
Fuzzy clustering of time series data: A particle swarm optimization approach
With rapid development in information gathering technologies and access to large amounts of data, we always require methods for data analyzing and extracting useful information from large raw dataset and data mining is an important method for solving this problem. Clustering analysis as the most commonly used function of data mining, has attracted many researchers in computer science. Because o...
A Nonlinear Dimensionality Reduction Using Combined Approach to Feature Space Decomposition
In this paper we propose a new combined approach to feature space decomposition to improve the efficiency of the nonlinear dimensionality reduction method. The approach performs the decomposition of the original multidimensional space, taking into account the configuration of objects in the target low-dimensional space. The proposed approach is compared to the approach using hierarchical cluste...
Journal title:
Volume / Issue:
Pages: -
Publication date: 2018